FedKL: Tackling Data Heterogeneity in Federated Reinforcement Learning by Penalizing KL Divergence

Authors

Abstract

One of the fundamental issues for Federated Learning (FL) is data heterogeneity, which causes accuracy degradation, slow convergence, and communication bottlenecks. Although the impact of data heterogeneity on supervised FL has been widely studied, the related investigation for Federated Reinforcement Learning (FRL) is still in its infancy. In this paper, we first define the type and level of data heterogeneity for FRL systems. By inspecting the connection between the global and local objective functions, we prove that local training can benefit the global objective if the local update is properly penalized by the total variation (TV) distance between the local and global policies. A necessary condition for the global policy to be learn-able from the local environments is also derived, which is directly related to the heterogeneity level. Based on the theoretical result, a Kullback-Leibler (KL) divergence based penalty is proposed, which constrains the model outputs in the distribution space, and a convergence proof of the resulting algorithm is provided. By jointly penalizing the model update at each training iteration with the proposed penalty, the method achieves a better trade-off between training speed (step size) and convergence. Experiment results on two popular Reinforcement Learning (RL) experiment platforms demonstrate the advantage of the proposed method over existing methods in accelerating and stabilizing the training process with heterogeneous data.
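
The core idea lends itself to a short sketch: a local policy-gradient loss with a KL penalty toward the global policy. This is a minimal illustration assuming a discrete-action policy network and a REINFORCE-style surrogate; the function name, `kl_coeff`, and the divergence direction are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fedkl_local_loss(local_logits, global_logits, actions, returns, kl_coeff=0.1):
    """Local policy-gradient loss with a KL penalty toward the global policy.

    local_logits:  (batch, n_actions) outputs of the local policy network
    global_logits: (batch, n_actions) outputs of the frozen global policy
    actions:       (batch,) sampled actions (LongTensor)
    returns:       (batch,) return or advantage estimates
    kl_coeff:      penalty weight (illustrative; the paper relates the
                   appropriate strength to the heterogeneity level)
    """
    log_probs = F.log_softmax(local_logits, dim=-1)
    # REINFORCE-style surrogate: maximize return-weighted log-probability.
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    pg_loss = -(returns * chosen).mean()

    # KL penalty in the output (distribution) space, in contrast to
    # parameter-space penalties such as FedProx's L2 proximal term.
    # Direction shown is KL(global || local); treat it as a design choice.
    global_log_probs = F.log_softmax(global_logits.detach(), dim=-1)
    kl = F.kl_div(log_probs, global_log_probs, reduction="batchmean", log_target=True)
    return pg_loss + kl_coeff * kl
```

Penalizing outputs rather than parameters means two parameter vectors inducing the same action distribution incur no penalty, which is the distinction the abstract draws against parameter-space methods.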

Similar resources

Transductive Transfer Learning Based on KL-Divergence

Transfer learning solves the problem that the training data from a source domain and the test data from a target domain follow different distributions. In this paper, we take advantage of existing well-labeled data and introduce them as sources into a novel transductive transfer learning framework. We first construct two feature mapping functions based on mutual information to re-weight the tra...

Color Constancy Using KL-Divergence

Color is a useful feature for machine vision tasks. However, its effectiveness is often limited by the fact that the measured pixel values in a scene are influenced by both object surface reflectance properties and incident illumination. Color constancy algorithms attempt to compute color features which are invariant to the incident illumination by estimating the parameters of the global scene ...

Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning

We present a framework combining hierarchical and multi-agent deep reinforcement learning approaches to solve coordination problems among a multitude of agents using a semi-decentralized model. The framework extends the multi-agent learning setup by introducing a meta-controller that guides the communication between agent pairs, enabling agents to focus on communicating with only one other agen...

Lecture 3: KL-divergence and connections

1 Recap. Recall some important facts about entropy and mutual information from the previous lecture:
• H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
• I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X,Y)
• I(X;Y|Z) = H(X|Z) − H(X|Y,Z)
• I(X;Y) = 0 if X and Y are independent
• I(X;Y) ≥ 0 or, equivalently, H(X) ≥ H(X|Y)
Exercise 1.1: Prove that H(X|Y) = 0 if and only if X = g(Y) for some ...
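
These identities hold for any finite joint distribution, so they are easy to sanity-check numerically. The sketch below verifies the chain rule and the non-negativity of mutual information on a small example; the joint distribution and the helper H are illustrative choices, not taken from the lecture.

```python
import numpy as np

def H(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A small joint distribution p(x, y) over 2 x 3 outcomes.
pxy = np.array([[0.10, 0.25, 0.05],
                [0.30, 0.10, 0.20]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# H(Y|X) = sum_x p(x) * H(Y | X = x)
H_Y_given_X = sum(px[i] * H(pxy[i] / px[i]) for i in range(len(px)))

# Chain rule: H(X,Y) = H(X) + H(Y|X)
assert abs(H(pxy.ravel()) - (H(px) + H_Y_given_X)) < 1e-9

# I(X;Y) = H(X) + H(Y) - H(X,Y), and I(X;Y) >= 0
I_XY = H(px) + H(py) - H(pxy.ravel())
assert I_XY >= 0
print(f"H(Y|X) = {H_Y_given_X:.4f} bits, I(X;Y) = {I_XY:.4f} bits")
```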

Using KL Divergence for Credibility Assessment

In reputation systems, agents collectively estimate each other's behaviour through feedback to decide with whom they can interact. To avoid manipulation, most reputation systems weight feedback with respect to the agents' reputation. However, these systems remain sensitive to some strategic manipulations, such as oscillating attacks or whitewashing. In this paper, we propose (1) a credibility meas...
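
The abstract is cut off before the proposed measure is stated, so the following is only a generic illustration of the idea named in the title: scoring feedback credibility by KL divergence from an aggregate feedback distribution. Every name and formula here is a hypothetical stand-in, not the paper's measure.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, with smoothing to avoid log(0)."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

def credibility(agent_feedback, consensus_feedback):
    """Map divergence from consensus to a (0, 1] credibility weight.

    Inputs are histograms of feedback over rating levels. This is a
    generic construction, not the measure proposed in the cited paper.
    """
    return 1.0 / (1.0 + kl_divergence(agent_feedback, consensus_feedback))

# Example: feedback over 5 rating levels; an oscillating attacker's
# feedback distribution diverges sharply from the consensus.
consensus = np.array([2.0, 5.0, 20.0, 40.0, 33.0])
honest    = np.array([1.0, 4.0, 22.0, 41.0, 32.0])
attacker  = np.array([50.0, 0.0, 0.0, 0.0, 50.0])
print(credibility(honest, consensus))    # close to 1
print(credibility(attacker, consensus))  # much smaller
```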

Journal

Journal title: IEEE Journal on Selected Areas in Communications

سال: 2023

ISSN: 0733-8716, 1558-0008

DOI: https://doi.org/10.1109/jsac.2023.3242734